A new scheme for unconstrained handwritten text-line segmentation

Authors

  • Alireza Alaei
  • Umapada Pal
  • P. Nagabhushan
Abstract

Variations in inter-line gaps and skewed or curled text-lines are among the challenging issues in the segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines, which frequently appear in unconstrained handwritten documents, significantly increase segmentation complexity. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentation. A new painting technique is employed to smear the foreground portion of the document image. The painting technique enhances the separability between the foreground and background portions, enabling easy detection of text-lines. A dilation operation is then applied to the foreground portion of the painted image to obtain a single component for each text-line. The background portion of the dilated image is thinned, and some trimming operations are subsequently performed, to obtain a number of separating lines, called candidate line separators. By using the starting and ending points of the candidate line separators and analyzing the distances among them, related candidate line separators are connected to obtain segmented text-lines. Furthermore, the problems of overlapping and touching components are addressed using some novel techniques. We tested the proposed scheme on text-pages of English, French, German, Greek, Persian, Oriya, and Bangla, and remarkable results were obtained.
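The first two stages named in the abstract (smearing the foreground, then dilating it so that each text-line becomes a single component) can be illustrated with a small sketch. The abstract does not specify how the painting technique works, so the stripe-wise smearing below is an assumption chosen for illustration, not the authors' method; the function names `paint_stripes`, `dilate_horizontal`, and `count_row_bands`, the stripe width, and the dilation radius are all hypothetical. Thinning, trimming, and the joining of candidate line separators are not covered.

```python
import numpy as np

def paint_stripes(binary, stripe_width):
    """Smear foreground stripe by stripe (an assumed stand-in for 'painting').

    Within each vertical stripe, a row is painted entirely as foreground
    if it contains any ink inside that stripe.
    """
    h, w = binary.shape
    painted = np.zeros((h, w), dtype=bool)
    for x0 in range(0, w, stripe_width):
        stripe = binary[:, x0:x0 + stripe_width]
        row_has_ink = stripe.any(axis=1)
        painted[:, x0:x0 + stripe_width] = row_has_ink[:, None]
    return painted

def dilate_horizontal(mask, radius):
    """Binary dilation with a 1 x (2*radius + 1) horizontal structuring element."""
    padded = np.pad(mask, ((0, 0), (radius, radius)), constant_values=False)
    out = np.zeros_like(mask)
    width = mask.shape[1]
    for shift in range(2 * radius + 1):
        out |= padded[:, shift:shift + width]
    return out

def count_row_bands(mask):
    """Count maximal runs of consecutive rows that contain foreground."""
    rows = mask.any(axis=1).astype(int)
    rising_edges = np.diff(np.concatenate(([0], rows))) == 1
    return int(rising_edges.sum())

# Synthetic page: two text-lines, each made of three separated "words".
page = np.zeros((20, 40), dtype=bool)
for top in (3, 12):
    for start, stop in ((2, 11), (15, 26), (30, 39)):
        page[top:top + 3, start:stop] = True

painted = paint_stripes(page, stripe_width=8)   # hypothetical stripe width
dilated = dilate_horizontal(painted, radius=2)  # hypothetical radius
print(count_row_bands(dilated))
```

On this toy page the word gaps inside each line are filled by the smearing step, while the blank rows between the two lines stay background. That background band is what the thinning step described in the abstract would reduce to a candidate line separator.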


Similar resources

Performance of Statistics Based Line Segmentation System for Unconstrained Handwritten Text

Handwritten character recognition is a technique by which a computer system could recognize characters and other symbols written in natural handwriting. Segmentation decomposes the document image into subcomponents like lines, words and characters. To achieve greater accuracy, segmentation and recognition could not be treated independently. Most of the existing line segmentation methods have li...


Morphology Based Handwritten Line Segmentation Using Foreground and Background Information

Text line segmentation is currently an important stage of research in historical document processing. Because of variability in inter-line distance and base-line skew, line segmentation in unconstrained handwritten documents is very difficult. The line segmentation task becomes more complicated when overlapping or inter-penetration occurs between two consecutive text lines. In this p...


Review: A Literature Survey on Text Segmentation in Handwritten Punjabi Documents

Gurumukhi script is used for Punjabi language, which is a two dimensional composition of symbols with connected and disconnected diacritics. Handwritten Gurumukhi script has some complexities like connected, overlapped text lines, words and characters. It is one of the foremost issues for errors during the recognition process. Text segmentation is a challenging job in unconstrained writer indep...


External word segmentation of off-line handwritten text lines

This paper describes techniques to separate a line of unconstrained (written in a natural manner) handwritten text into words. When the writing style is unconstrained, recognition of individual components may be unreliable so they must be grouped together into word hypotheses, before recognition algorithms (which may require dictionaries) can be used. Our system uses original algorithms to dete...


Robust Segmentation of Unconstrained Online Handwritten Documents

A segmentation algorithm, which can detect different regions of a handwritten document such as text lines, tables and sketches will be extremely useful in a variety of applications such as retrieval, translation and genre classification. However, this task is extremely challenging for handwritten documents, which vary considerably in their structure and content. In this paper, we describe a rob...



Journal:
  • Pattern Recognition

Volume: 44

Publication date: 2011